(57) Abstract: The present invention relates to a system for identifying, isolating and utilizing
promoter elements useful for expression of nucleotide sequences and the proteins encoded thereby
in a thermophile. In one embodiment, a recombinant DNA molecule is provided, and comprises a
reporter sequence, a putative thermophile promoter, a selectable marker sequence, and a 3' and a
5' DNA targeting sequence that are together capable of causing integration of at least a portion
of said DNA molecule into the genome of a thermophile. Further, within the recombinant DNA, the
reporter sequence is under the transcriptional control of a promoter which functions in a
thermophile to form a promoter/reporter cassette, the promoter/reporter cassette is flanked by
said 3' and said 5' DNA targeting sequences, and the promoter/reporter cassette is positioned in
the opposite orientation of the DNA targeting sequences.
FIELD OF THE INVENTION

The present invention relates to the identification and utilization of promoters for
expression of nucleic acid sequences in thermophiles.

BACKGROUND OF THE INVENTION

The high-temperature operating conditions of certain industrial processes in the areas
of pharmaceutical synthesis, biodegradation of complex agricultural and waste
compounds, and food processing dictate the use of thermostable enzymes that can function at
high temperatures. A significant advantage thermostable enzymes provide is the cost savings
resulting from longer storage stability and higher activity at high temperature.

Thermostable enzymes have traditionally been used for saccharification in food
processing and proteolysis in the detergent industry (Williams, R. A. D. Biotechnological
applications of the genus Thermus. Thermophiles: Science and Technology, Reykjavik,
Iceland, 1992). Glucosidases are utilized extensively throughout the starch processing
industry. Thermostable carbohydrases are directly involved in the manufacture of all
starch-derived products. Isomerases are involved in production of high-fructose corn syrup.
Two other important industrial carbohydrases are the pectolytic enzymes and lactase
(Burgess, K. and M. Shaw. [title illegible in source], 1983; Rombouts, F. M. and W. Pilnik.
In Microbial enzymes and bioconversions. Ed. by A. H. Rose, NY, p. 269, 1980). A recent
development in the industrial enzyme area is the use of cellulase for the production of
glucose from cellulose (Mandels, M. In Annual reports on fermentation processes. Ed. by
G. T. Tsao, NY, p. 35, 1982). Proteolytic enzymes, constituting a significant segment of the
total industrial enzyme market, are utilized in the detergent industry (Godfrey, T. and
J. Reichelt. (1983) Industrial enzymology.).

One of the most promising new applications of thermostable enzymes is in the
manufacture of specialty chemicals and pharmaceutical intermediates. Enzymes (or
biocatalysts) are now being viewed as clearly superior in cases where stereospecific synthetic
reactions are involved, such as synthesis of chiral compounds as pharmaceutical
intermediates. Enzymes can carry out the reaction more specifically and under conditions
which are safer for the environment. Thermostable enzymes have advantages since they are
generally more stable in organic solvents, can carry out reactions at high temperatures where
substrate and product solubility is higher, and can be recycled and used for longer periods of
time because of their inherent stability.

A relatively new application of thermostable enzymes is PCR-based diagnostics. 
Thermostable polymerases have been extremely useful in the detection and molecular 
characterization of agents causing cancer, AIDS, and numerous other infectious diseases. 
Thermostable DNA-polymerase can already compete in market value with enzymes having 
traditional applications. Thermostable DNA replication proteins have important applications
[...]

[The following passage is largely illegible in the source; only the legible fragments are
reproduced.] [...] Stetter. Thermophilic bacteria. In Thermophilic bacteria, Ed. by [...]
temperature range of 45 to 85°C and not [...] Biotechnological [...] Thermophiles: Science and
Technology, Reykjavik, Iceland, 1992). These strains [...] less than 2 hours, the
microorganisms of the genus Thermus [...]

Despite the widespread interest in Thermus cultures for a variety of applications [...]
Currently the expression of [...] fermentation strains which have been metabolically [...]
in bioprocess applications such as [...] pharmaceutical intermediates. Additionally, the
systems provide [...] to altered [...] thermostability.
BRIEF DESCRIPTION OF THE DRAWINGS

FIGURE 1. [Caption illegible in source.]

[Figure number illegible.] Comparison of terminator sequences from Thermus. [...]

FIGURE 3. Construction of pTG200 and [...] of promoter test vectors. A) [...] construction of
pTG200. B) pTG200 consists of an E. coli shuttle vector with the Thermus [...] in the opposite
direction. A strong Thermus transcription terminator is placed [...] transcription through the
gene in the [...]. Promoter-test vectors were [...] by using primers to the two ends of the kan
gene [...] 50-60 bp promoter attached at the 5' end.

SUMMARY OF THE INVENTION

The present invention relates to a system for identifying, isolating and utilizing
promoter elements useful for expression of nucleotide sequences and the proteins encoded
thereby in a thermophile. In one embodiment, a recombinant DNA molecule is provided, and
comprises a reporter sequence, a putative thermophile promoter, a selectable marker
sequence, and a 3' and a 5' DNA targeting sequence that are together capable of causing
integration of at least a portion of said DNA molecule into the genome of a thermophile.
Further, within the recombinant DNA, the reporter sequence is under the transcriptional
control of a promoter which functions in a thermophile to form a promoter/reporter cassette,
the promoter/reporter cassette is flanked by said 3' and said 5' DNA targeting sequences, and
the promoter/reporter cassette is positioned [...] promoter [...] thermophile with the
above-described recombinant DNA molecule [...] promoters which have been identified by the
above method.

DETAILED DESCRIPTION

Within this application, unless otherwise [...] Academic Press, San Diego, CA [...]
Methods in Enzymology, Academic Press, San Diego, CA [...] Purification, in Deutscher,
M. P. [...] references, issued patents and pending patent [...]

[...] is involved in the binding of RNA [...] organism such as
[...] individual coding segments ("exons"). Coding refers to the representation of amino
acids, start and stop signals in a three base "triplet" code. Promoters are often upstream
("5' to") the transcription initiation site of the corresponding gene. Other regulatory
sequences of DNA in addition to promoters are known, including sequences involved with the
binding of transcription factors, including response elements that are the DNA sequences
bound by inducible factors. Enhancers comprise yet another group of regulatory sequences of
DNA that can increase the utilization of promoters, and can function in either orientation
(5'-3' or 3'-5') and in any location (upstream or downstream) relative to the promoter.
Preferably, the regulatory sequence has a positive activity, i.e., binding of an endogenous
ligand (e.g. a transcription factor) to the regulatory sequence increases transcription,
thereby resulting in increased expression of the corresponding target gene. A promoter may
also include or be adjacent to a regulatory sequence known in the art as a silencer. A
silencer sequence generally has a negative regulatory effect on expression of the gene.

Provided herein are methodologies and reagents for [...] systems for efficient
expression of nucleotide sequences in a host organism. Preferably, the host cell is a member
of the kingdom Bacteria, Archaea or Eukarya. In one embodiment, and for the purpose of
testing the constructs provided herein, it is preferred that the host cell is E. coli. More
preferably, the expression systems comprise promoter elements capable of regulating gene
expression in a thermophile. Even more preferably, the host cell is a member of the genus
Thermus, and most preferably the host cell is of the [species name illegible in source]. The
present invention provides reagents and methodologies useful for identifying promoters
having activity in a thermophile, preferably of the genus Thermus.

Using the reagents and techniques described in this application, inducible and
constitutive promoters, integrative and plasmid-based vectors, and [...] and characterization
of a promoter. For instance, the vector may be a [...] bacteriophage, virus, phagemid,
cointegrate of one or more species, [...] vector is amenable to expression of a nucleotide
sequence in a prokaryotic cell such as [...]. It is further preferable that the vectors be
capable of functioning in different types of cells (i.e., shuttle), such as Thermus or E. coli.

It is also possible to use the present invention to accomplish model pathway
engineering and/or thermostabilization using tandem expression systems. The
invention provides promoter sequences and expression vectors that allow for various levels of
expression in a thermophilic host cell. The levels of expression may [...] inherent
properties of the promoter itself or [...] the assistance of an additional regulatory
sequence. The availability of such promoter sequences will provide a crucial tool for use in
driving expression of thermostable enzymes for use in multiple applications.
As would be understood by the skilled artisan, the development of efficient
thermophile expression systems is important for several applications including, for example,
the isolation and use of nucleic acid sequences from extreme- or hyper-thermophiles that may
not be efficiently expressed in mesophilic systems. Thousands of genes [...] ORFs have been
discovered from these industrially important organisms through genome sequencing projects.
High-temperature fermentation strains with altered metabolism may be [...] in bioprocess
applications for the production of pharmaceutical intermediates. The present invention
provides tools for thermostabilizing mesophilic proteins by selection in [...] and allows for
the identification of the sequence determinants involved in [...].

[...] the reagents and methodologies provided herein may be utilized for the [...] novel
genes from genome sequencing projects has significantly expanded [...] thermostable
enzymes. [...] such as Sulfolobus solfataricus, Pyrococcus furiosus, [...] industrial
processes where thermostable enzymes may be utilized.

In order to be a commercially viable enzyme, the enzyme must be capable of being
produced and recovered in large quantities in an organism with [...]. A heterologous host is
useful because of the difficulties in growing hyperthermophiles and the [...] of systems for
cloning and gene analysis for such strains. In one embodiment of the present invention, the
host cell is E. coli. However, in certain situations E. coli is a sub-optimal host for
hyperthermophile gene expression. In several studies, it was shown that certain
hyperthermophilic proteins could not assemble properly in E. coli ([...] Discovery and
Production of Recombinant Gene Expression in the Marine [...]. [...] Symposium on [...]
Microorganisms, Montreal, [year illegible]; Laderman, et al. [...] Hyperthermophilic
Archaebacterium Pyrococcus furiosus. J. Biol. Chem. 268:24402-[...]). In addition, the
temperature optima of the proteins being studied are generally too high to permit an analysis
of their function in vivo in the mesophilic organism. Thus, in a preferred embodiment the
host cell is from the genus Thermus. Temperature-dependent folding and activity of proteins
from hyperthermophiles make species of the genus Thermus a more preferred host for
expression of such proteins.

2. Engineered fermentation strains for high-temperature bioprocesses. Many
industrial bioprocesses utilize whole-cell fermentation techniques. In many [...] of an
isolated enzyme system is too expensive or impractical. Many enzymes, such [...]
intermediates, require co-factors such as NAD(P) for their reactions. Cofactors are utilized
stoichiometrically during the reaction and must be repeatedly added to the reaction mixture or
the reaction must regenerate the cofactor. A whole-cell system provides an alternative for
many of these enzymes. Other enzymes may be membrane-bound or require complex subunit
or multi-enzyme complexes (such as cytochrome P-450s), allowing for simpler
implementation using a whole-cell system. Finally, the synthesis of complex molecules such
as steroids, antibiotics, and other pharmaceuticals may require complicated and multiple
catalytic pathways. In an isolated system, each step would need to be engineered. In
contrast, the organism utilized in a whole-cell system provides each of the required pathways.
The tools provided herein may be utilized to engineer Thermus with multiple genes, thus
providing the organism with the necessary pathways for carrying out such bioprocesses.

In addition, it is often desired to carry out synthetic reactions at high temperatures,
because either the reaction is exothermic (therefore cooling is not needed), the substrate or
product solubility is greater at higher temperature (which can drastically increase throughput),
the viscosity of the reaction is improved (as is the case in many food applications such as the
processing of cheese whey), or the reaction proceeds significantly faster at the higher
temperature.

3. Thermostabilization of mesophilic genes. The thermolability of most

mesophilic proteins can limit their industrial use. It was proposed in the early 1980s that
thermostabilization of mesophilic proteins could be accomplished by carrying out activity
selections in organisms which grow at high temperatures (Matsumura, et al. (1984)
Enzymatic and Nucleotide Sequence Studies of a Kanamycin-Inactivating Enzyme Encoded
by a Plasmid from Thermophilic Bacilli in Comparison with That Encoded by Plasmid
pUB110. J. Bacteriol. 160:413-420; Liao, et al. (1986) Isolation of a thermostable enzyme
variant by cloning and selection in a thermophile. Proc. Natl. Acad. Sci. USA. 83:576-580).
Applicants have developed several directed evolution methods to accelerate the evolution of
protein properties such as thermostability through the use of both in vivo and in vitro
techniques. Directed evolution relies on a random, but targeted, approach to generating
mutations of interest. By carrying out sequential generations of random mutagenesis on a
gene of interest, coupled with selection or screening of the resulting proteins, numerous
proteins with improved properties have been developed. In each generation, a single variant
is generally chosen as the parent for the next generation, and sequential cycles allow the
evolution of the desired features. Alternatively, effective mutations identified during one or
more generations can be recombined using methods such as "DNA shuffling" (represented by,
for example, sexual PCR). Traits which have been enhanced may include, but are not limited
to, improved substrate specificity, catalytic activity, activity in the presence of organic
solvents, expression level and stability.
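The generational cycle described above (random mutagenesis, selection of a single parent, repetition) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the target sequence, the fitness function, the mutation rate and the library size are all hypothetical, and the rule of keeping the old parent when no variant improves on it is a simplification chosen for the sketch, not something taken from the specification.

```python
import random

ALPHABET = "ACGT"


def mutate(seq, rate, rng):
    """Return a copy of seq in which each base is substituted with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in ALPHABET if b != base]))
        else:
            out.append(base)
    return "".join(out)


def directed_evolution(parent, fitness, generations=10, library_size=200,
                       rate=0.02, seed=0):
    """Repeat: build a mutant library, pick the single best variant as the next parent."""
    rng = random.Random(seed)
    for _ in range(generations):
        library = [mutate(parent, rate, rng) for _ in range(library_size)]
        best = max(library, key=fitness)
        # Assumption of this sketch: keep the old parent if nothing improved on it.
        if fitness(best) > fitness(parent):
            parent = best
    return parent


# Hypothetical stand-in for a selection or screen: identity to an arbitrary target.
TARGET = "ATGGCTAGCAAAGGAGAAGAACTTTTCACT"


def toy_fitness(seq):
    return sum(a == b for a, b in zip(seq, TARGET))


start = "A" * len(TARGET)
evolved = directed_evolution(start, toy_fitness)
```

In a real experiment the fitness call would be replaced by growth or activity at the selection temperature; the loop structure is the only part meant to carry over.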

Liao, et al. ((1986) Isolation of a thermostable enzyme variant by cloning and
selection in a thermophile. Proc. Natl. Acad. Sci. USA. 83:576-580) first demonstrated in
vivo thermostabilization of a gene by using kanamycin nucleotidyl transferase in Bacillus
stearothermophilus, where resistance to 63°C was shown. To improve the genetic
thermostabilization approach, a gene transfer system for Thermus was developed where the
upper growth limit was above 80°C instead of 65°C as in Bacillus (described in, for example,
US 5,786,174, which is hereby incorporated by reference). These experiments were initially
conducted using the thermostabilized kan gene, in which the initial Km^r supported growth
only to 55°C in Thermus and not to 63°C as reported by Liao, et al. ((1986) Isolation of a
thermostable enzyme variant by cloning and selection in a thermophile. Proc. Natl. Acad.
Sci. USA. 83:576-580). The regulated expression system provided herein allows for
fine-tuning of thermostabilization selection experiments so that the temperature range can be
regulated and controlled and cutoff temperatures for selection adjusted in subsequent rounds
of mutagenesis.

Some important elements of the genetic background of Thermus have been previously
described. The generation of mutations (Koyama, et al. (1990) Cloning and sequence
analysis of tryptophan synthetase genes of an extreme thermophile, Thermus thermophilus
HB27: Plasmid transfer from replica-plated Escherichia coli recombinant colonies to
competent T. thermophilus cells. J. Bacteriol. 172:3490-3495; Koyama, et al. (1990) A
plasmid vector for an extreme thermophile, Thermus thermophilus. FEMS Microbiol. Lett.
72:97-102; Lasa, et al. (1992) Insertional mutagenesis in the extreme thermophilic eubacteria
Thermus thermophilus HB8. Molec. Microbiol. 6:1555-1564), chromosomal integration
(Koyama, et al. (1990) Cloning and sequence analysis of tryptophan synthetase genes of an
extreme thermophile, Thermus thermophilus HB27: Plasmid transfer from replica-plated
Escherichia coli recombinant colonies to competent T. thermophilus cells. J. Bacteriol.
172:3490-3495; Koyama, et al. (1990) A plasmid vector for an extreme thermophile,
Thermus thermophilus. FEMS Microbiol. Lett. 72:97-102; Lasa, et al. (1992) Insertional
mutagenesis in the extreme thermophilic eubacteria Thermus thermophilus HB8. Molec.
Microbiol. 6:1555-1564), plasmids (Mather, et al. (1990) Plasmid-associated aggregation in
Thermus thermophilus HB8. Plasmid. 24:45-56; Hishinuma, et al. (1978) Isolation of
extrachromosomal deoxyribonucleic acids from extremely thermophilic bacteria. Jour. of
General Microbiology. 104:193-199), and phages (Sakaki, et al. (1975) Isolation and
Characterization of a Bacteriophage Infectious to an Extreme Thermophile, Thermus
thermophilus HB8. J. Virol. 15:1449-1453) have also been studied. Several successful
attempts to develop cloning systems using plasmids and chromosomal integration systems
were demonstrated (Koyama, et al. (1986) Genetic transformation of the extreme
thermophile Thermus thermophilus and of other Thermus spp. J. Bacteriol. 166:338-340;
Lasa, et al. (1992) Development of Thermus-Escherichia Shuttle Vectors and Their Use for
Expression of the Clostridium thermocellum celA Gene in Thermus thermophilus. J.
Bacteriol. 174:6424-6431; Mather, et al. (1992) Development of Plasmid Cloning Vectors
for Thermus thermophilus HB8: Expression of a Heterologous, Plasmid-Borne Kanamycin
Nucleotidyltransferase Gene. Appl. Environ. Microbiol. 58:421-425). However, none of
these provides the versatility of the systems provided herein.

4. Thermus expression signals. More than twenty genes from Thermus species
have been cloned and sequenced (Table 1). However, none of these sequences illustrate the
optimal regulatory elements needed to develop a useful system for expression in a
thermophile such as Thermus. Applicants have previously documented the nucleotide
sequences encoding phosphatases, glycolytic enzymes (β-glucosidases and β-galactosidases),
and biosynthetic genes (pyrE, hisB, leuB) from Thermus by complementation of their
functions or by testing the specific activities (Weber, et al. (1995) A chromosome integration
system for stable gene transfer into Thermus flavus. Bio/Technology, Vol. 13(3): 271-275).
Expression of these sequences in E. coli demonstrated that most Thermus genes are capable
of being transcribed in E. coli. The comparison of nucleotide sequences from some Thermus
genes and operons reveals putative translation initiation signals that are similar to known E.
coli motifs (Table 1 shows exemplary sequences).

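Table 1

Putative translation initiation signals of known genes of the genus Thermus

Columns in the original table: Organism; Gene; Start codon; RBS; -10 region; -35 region; Ref.
Row alignment was lost in extraction. Within each organism block below, the values of each
column are listed in their original order; where the counts differ between columns, the
assignment of values to individual genes cannot be recovered from the source.

T. aquaticus YT1
  Gene:        L-lactate DH, aquaI
  Start codon: ATG
  RBS:         n.f., AGGAG
  -10 region:  n.f., TAGCTT
  -35 region:  TTGACA
  Ref.:        (20), (21), (22)

T. aquaticus B
  Gene:        sucD, mdh
  Start codon: GTG, GTG
  RBS:         GGAGGTG, AAGGAG
  -10 region:  n.f., n.f.
  -35 region:  n.f., n.f.
  Ref.:        (23), (23)

T. thermophilus HB8
  Gene:        tufB, tufA, icdh, slpA, xylose isomerase, gOX, nox, 16S RNA, 23S/5S RNA,
               NADH dh, 4.5S RNA
  Start codon: ATG, ATG, GTG, ATG, ATG
  RBS:         AGGAGGA, GGAGG, AAGGAGGTG, AGGAGG, GAGG, n.f., AAGGAGGGG
  -10 region:  n.f., TGTAGT, TACGAT, n.f., ATAAT, n.f., TAGCAT, TATCTT, TAAGAT, TATACT
  -35 region:  n.f., TTACAA, TTGACA, n.f., n.f., n.f., TTGACA, TTGACA, TTGCGC, TAGCCT
  Ref.:        (25), (26), (27), (28), (29), (30), (31), (31), (31)

T. thermophilus HB27
  Gene:        [gene names illegible in source]
  Start codon: GTG
  RBS:         AGGGAG, GGGAGG
  -10 region:  TAGGAT, n.f.
  -35 region:  TTTACC, n.f.
  Ref.:        (11), (11)

T. flavus AT62
  Gene:        sucA, mdh
  Start codon: GTG, GTG
  RBS:         GGAGG, AAAGGAGG
  -10 region:  n.f., n.f.
  -35 region:  n.f., n.f., TTGACA
  Ref.:        (32), (32)

[Organism illegible in source]
  Gene:        ntsA/intB [as extracted]
  Start codon: ATG
  RBS:         [illegible], AAGGG
  -10 region:  TATAAT, TAGACT
  -35 region:  TTGTAG, [illegible]

n.f. - no homology to E. coli found; RBS - ribosome binding site; -10 and [remainder of
footnote illegible in source].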

The expression system provided herein comprises certain genetic elements whose
configuration has been manipulated to ensure high levels of protein synthesis in a
thermophilic host such as Thermus. In one embodiment, the central element of the expression
system is a promoter positioned upstream of a ribosome-binding site, RBS, which is further
under the control of a regulatory gene. Provided herein are several exemplary T.
thermophilus promoters having activity in T. flavus. In another embodiment, novel promoter
regions from bacteria belonging to genus Thermus are provided. In yet another embodiment,
a promoter probe vector for Thermus is provided and is useful for evaluating promoter
strength.

The isolation of promoters that function in a thermophile such as Thermus
may be accomplished by generating a library of genomic fragments from the
organism by restriction digest or other method for generating fragments and
inserting the fragments into a reporter vector. As a source for the genomic DNA, one could
[...] any suitable bacterial strain. [...] generating a Thermus promoter library, Thermus
chromosomal DNA may be partially digested with a restriction endonuclease such as [enzyme
name illegible in source] and a [...] of the digested [...] having the desired size (for
example, approximately [...] kb) [...] methods such as elution from an agarose gel. These
fragments are then ligated with DNA from a reporter vector that has been digested with an
appropriate enzyme for ligation to the [...].
To function properly, the reporter or promoter-probe vector requires the following
elements: 1) an E. coli origin of replication (ColE1 was used in pGEMtbg); 2) a marker gene
functioning in E. coli (such as the selective drug-resistance marker [...] that confers
[...]); [...] terminator (TT) upstream of the reporter gene. Other origins [...], marker
genes, reporter genes, and terminators can be used as well to construct similar
promoter-probe vectors.

The reporter vector may comprise a plasmid backbone derived from commonly
available vector plasmids including but not limited to pBR322, pBR325, pBR327, pUC8,
pUC9, [...], pUC18, pUC19, [two vector names illegible in source]. The reporter vector is
[...] a suitable reporter sequence such as lacZ, tbg, or a drug resistance gene is positioned
downstream (or 3') from a polylinker site containing several [...] restriction enzyme sites.
A suitable reporter sequence encodes a gene product detectable via [...].
A number of genes can be used as either markers to detect insertion of the gene in
Thermus or as reporters which can be used to analyze expression in Thermus. Table 2
contains a few examples of such genes, but it should be understood by the skilled artisan that
others may also be suitable. Some of these genes confer selectable phenotypes. In this case,
media conditions can be established so that only colonies which have expression of these
genes activated will grow. Other genes confer phenotypes which can be screened. A
screenable phenotype can often yield information about levels of gene expression.
Table 2

Marker/reporter genes which can be used in Thermus.

Columns in the original table: Marker; Description; Type; Integrative; FN; Comments. The
values of the Type, Integrative and FN columns are not legible in the source, apart from one
"SEL" entry whose row cannot be determined.

Marker:      kan^tr (ref. A)
Description: Thermostable kanamycin resistance gene
Comments:    Works with a variety of promoters and successfully used in promoter-test
             experiments. Expression level affects resistance level in host.

Marker:      HLADH (ref. B)
Description: Horse Liver Alcohol Dehydrogenase
Comments:    Expression confirmed under control of Leucine promoter in Thermus. Although
             not from a thermophile, the gene is stable up to about 70°C. Can be quantitated.

Marker:      pyrE (ref. C)
Description: Orotidine-5'-phosphoribosyl transferase
Comments:    Selectable marker which can be used as a site of integration in Thermus.

Marker:      leuB (ref. D)
Description: Isopropylmalate dehydrogenase
Comments:    Selectable marker used as a site of integration in Thermus.

Abbreviations: SEL: selectable marker; SCR: screenable marker; ND: not determined; FN: functional.
References: A. US 5,786,174; (Weber, et al. (1995) A chromosome integration system for stable
gene transfer into Thermus flavus. Bio/Technology, Vol. 13(3), pp. 271-275). B. U.S. Prov. App.
No. 60/046,182 filed May 12, 1997. C. US 5,786,174. D. US 5,786,174.

The vector may contain more than one reporter sequence located 5' and 3' of the
polylinker region. In this manner, the orientation of a promoter region at the polylinker site
will not affect expression of the reporter sequence. The genomic fragments are ligated into
the polylinker region using standard techniques. In one embodiment, the promoter sequences
are amplified by PCR using primers containing, for example, an EcoRI site, -35, -10 and
several downstream residues (to include the +1 transcription site). The amplified sequences
are then digested with EcoRI and HindIII and subcloned into the pTG200 reporter vector.
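As a rough illustration of the layout just described (restriction site, -35, -10, a few downstream residues), the following Python sketch assembles a synthetic promoter insert and checks that each cloning site occurs exactly once. The function name, the "GC" end clamp, the 17-nt spacer and the downstream residues are hypothetical choices made for the example; only the EcoRI (GAATTC) and HindIII (AAGCTT) recognition sequences are standard facts.

```python
ECORI = "GAATTC"    # EcoRI recognition site
HINDIII = "AAGCTT"  # HindIII recognition site


def build_promoter_insert(minus35, spacer, minus10, downstream, clamp="GC"):
    """Assemble clamp + EcoRI + (-35, spacer, -10, downstream) + HindIII + clamp.

    Raises ValueError if either cloning site would occur more than once,
    because digestion would then cut inside the promoter region.
    """
    core = (minus35 + spacer + minus10 + downstream).upper()
    insert = clamp + ECORI + core + HINDIII + clamp
    for name, site in (("EcoRI", ECORI), ("HindIII", HINDIII)):
        if insert.count(site) != 1:
            raise ValueError(f"{name} site must occur exactly once in the insert")
    return insert


# Hypothetical example: consensus-like hexamers with an arbitrary 17-nt spacer.
example_insert = build_promoter_insert(
    minus35="TTGACA",
    spacer="CCGGCGTCCTGGCCCTG",
    minus10="TATAAT",
    downstream="GCCGGCA",
)
```

The uniqueness check is the useful part of the sketch: a promoter fragment carrying an internal EcoRI or HindIII site would need different flanking sites.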

In a preferred embodiment, a library of T. thermophilus chromosomal fragments is
generated by restriction enzyme digestion and cloned into a reporter vector. The reporter
sequence is positioned downstream (or 3') of the polylinker region such that insertion of a
promoter sequence into the polylinker region will result in expression of the reporter
sequence.

Following construction of the reporter vector, the vector may then be transformed into 
a host cell such as E. coli and screened for promoter activity. Those fragments showing 
promoter activity in E. coli are then transformed into a thermophile to detect promoter 
activity in the thermophile. For sequences with promoter activity in E. coli identified from a
T. thermophilus chromosomal fragment library, the reporter vector is preferably subsequently 
transformed into T. thermophilus, and expression of various markers and model enzymes 
assayed. 
In another embodiment, a reporter vector is utilized which has an integrative element
capable of driving integration of the reporter sequence into the genome of the host organism.
Preferably, the reporter sequence is positioned in the opposite orientation of the integrative
sequences such that transcriptional occlusion does not occur. As a further safeguard, it is
preferable that a transcriptional termination (TT) sequence be included in the integration
sequence, the TT sequence consisting of an inverted repeat followed by an AT-rich region,
where the repeat region pauses the RNA polymerase and the AT-rich region serves to separate
it from the DNA template. Similar
to the reporter vector described above, the reporter sequence is preferably positioned
downstream of a polylinker site into which a putative promoter sequence may be inserted.
For example, an integrative vector may comprise portions of the Thermus leuB gene both 5'
and 3' of a cassette containing a drug resistance gene adjacent to a putative promoter
sequence. If the putative promoter sequence is capable of driving gene expression in
Thermus, the host cell will gain resistance to a compound by virtue of expression of the
reporter sequence (i.e., drug resistance gene) controlled by the putative promoter sequence.
The absence of drug resistance indicates that the promoter is not active in Thermus.
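The opposite-orientation arrangement can be pictured with a short Python sketch: the promoter/reporter cassette is reverse-complemented before being placed between the 5' and 3' targeting arms, so that it reads on the opposite strand. All three sequences below are short hypothetical placeholders, not sequences from leuB or from any vector described here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def build_integration_construct(arm5, cassette, arm3):
    """Place the cassette between the targeting arms in the opposite orientation."""
    return arm5.upper() + reverse_complement(cassette) + arm3.upper()


# Hypothetical placeholder sequences, chosen only to make the example runnable.
arm5 = "ATGGCCAAGC"                    # stands in for a 5' targeting sequence
cassette = "TTGACAGGTATAATGGAGGTGATG"  # stands in for promoter plus reporter start
arm3 = "GGCTTCTAAG"                    # stands in for a 3' targeting sequence

construct = build_integration_construct(arm5, cassette, arm3)
```

Reading the top strand of `construct` left to right, the cassette appears reversed and complemented, which is what transcription in the opposite direction to the flanking gene fragments amounts to.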

In another embodiment, sequencing of the Thermus genome may be performed and 
putative promoter sequences identified using computerized searching algorithms. For 
example, a region of a Thermus genome may be sequenced and analyzed for the presence of 
putative promoters using Neural Network for Promoter Prediction software, NNPP. NNPP is 
a time-delay neural network consisting mostly of two feature layers, one for recognizing 
TATA-boxes and one for recognizing so called "initiators", which are regions spanning the 
transcription start site. Both feature layers are combined into one output unit. These putative 
sequences may then be cloned into a reporter vector suitable for preliminary characterization 
in E. coli and/or direct characterization in Thermus. 

To optimize the promoter sequence, the length of the promoter sequence can be 
optimized by performing deletion analysis, such as by using an exonuclease (such as ExoI or
Bal31) to create sequential deletions in the promoter sequence or by generating a series of 
oligonucleotides with shortened sequences from each side of the isolated promoter sequence. 
The individual deletions can then be tested for activity and expression from each of the 
promoter regions can be quantitated to determine the minimal sequence needed to confer 
expression. This minimal promoter region can then be used to express genes of interest in 
Thermus. 
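The deletion analysis can be sketched computationally as trimming a fragment from each side while an activity test still passes. In the Python sketch below, `is_active` stands in for the experimental assay and is entirely hypothetical, as is the toy rule used to demonstrate it; the greedy trimming also assumes that activity is lost once an essential element is cut and does not return on further trimming.

```python
def minimal_promoter(seq, is_active, step=1):
    """Trim `step` bases at a time from the 5' end, then from the 3' end,
    for as long as the remaining fragment is still reported active."""
    left = 0
    while left + step < len(seq) and is_active(seq[left + step:]):
        left += step
    right = len(seq)
    while right - step > left and is_active(seq[left:right - step]):
        right -= step
    return seq[left:right]


# Hypothetical stand-in for the reporter assay: "active" if both hexamers are present.
def toy_assay(fragment):
    return "TTGACA" in fragment and "TATAAT" in fragment


core = "TTGACA" + "C" * 17 + "TATAAT"
isolated_fragment = "GGGG" + core + "CCCC"
minimal = minimal_promoter(isolated_fragment, toy_assay)
```

With a quantitative assay, `is_active` could instead compare expression against a chosen fraction of the full-length fragment's level.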

The reagents and methodologies provided herein also provide for the identification of
regulated promoters and regulatory elements. By exchanging certain promoter elements from
one reporter vector to another using standard molecular biology techniques, specific
sequences having certain regulatory effects (i.e., increase or decrease expression) on
expression of a sequence may be identified. For instance, following identification of a
promoter region within a DNA fragment using the techniques described above, certain
portions of the promoter may be deleted or excised from the DNA fragment, and the modified
promoter re-tested. If expression is still observed after this modification, determining
whether expression has increased or decreased following the modification allows a
positive or negative regulatory element of the promoter to be identified. In addition,
specific regions of the putative promoter element may be isolated and tested in isolation. In
this manner, specific elements may be identified that regulate gene expression in the host cell.
In addition, various regulatory elements identified as described above may be combined into
novel promoter sequences. It is also possible to use the techniques described herein to
construct hybrid regulated promoters and vectors for regulated expression by combining one
or more regulatory elements with a promoter sequence not typically associated with those
regulatory elements. The hybrid promoters can then be tested for activity, and expression from
each of the promoter regions can be quantitated to determine the minimal sequence needed to
confer expression. The hybrid promoter region may then be used to express genes of interest
in Thermus. Thus, the development of efficient regulated promoters for expression of
nucleotide sequences in a thermophile is provided by the instant invention.

Trans-acting regulatory elements may also be identified by screening the libraries in
E. coli. As will be understood by the skilled artisan, these elements can be placed on different
plasmids and both will remain functional.

The constructs described herein may also be utilized to construct optimal expression
systems for the production of industrially important thermophilic model proteins including
but not limited to lipases, esterases, hydrogenases, and proteases. In addition, the constructs
can be utilized to generate bacterial strains with multiple chromosomal insertions and
characterize such strains for use in fermentations.

The following Examples are for illustrative purposes only and are not intended, nor
should they be construed, as limiting the invention in any manner. Those skilled in the art
will appreciate that variations and modifications can be made without violating the spirit or
scope of the invention.

EXAMPLES
Example 1

Screening a library of T. thermophilus chromosomal fragments for sequences with
promoter activity in E. coli

A. Assembly of a promoter probe vector for selection of Thermus promoters in E. coli

One strategy for discovery of Thermus promoters is to first identify a promoter from
Thermus which functions in an intermediate strain, such as E. coli, and then test the
identified promoters in Thermus. Performing this two-step process can
potentially dissociate the promoter from a regulatory element and help identify Thermus
promoters that may be tightly controlled.

For primary selection of Thermus chromosomal fragments exhibiting promoter
activity in E. coli, the promoter probe vectors pGEMtbg and pVUF10tbg (Fig. 1) were
constructed. The promoter-probe vectors utilized herein include: 1) an E. coli origin of
replication (ColE1 was used in pGEMtbg); 2) a marker gene which functions in E. coli (the
selective drug-resistance marker bla conferring ampicillin resistance was used in pGEMtbg);
3) a promoterless reporter gene (tbg was used in pGEMtbg); and, 4) a transcriptional
terminator (TT) upstream of the reporter gene.

In the construction of pGEMtbg, the tbg gene of T. aquaticus encoding Thermo-β-
galactosidase (Tbg) was used as a reporter sequence. Tbg expression can be detected using
several possible chromogenic substrates such as 5-bromo-4-chloro-3-indolyl-β-D-
galactopyranoside (X-Gal) and 5-bromo-4-chloro-3-indolyl-β-D-glucopyranoside (X-Glc) to
identify clones exhibiting β-glucosidase (or β-galactosidase) activity. Expression of E. coli
β-glucosidase and β-galactosidase activities is tightly controlled under uninduced conditions.
In addition, the background activity of the endogenous enzyme is insufficient to turn colonies
blue, and E. coli lacZ β-galactosidase mutants are common. Tbg also demonstrates
thermostability, which facilitates assay of the enzyme's activity in crude cell lysates. Heating
of lysates for 15 minutes at 65°C totally inactivates endogenous activity, making the
detection of low activities of Tbg possible. To incorporate the tbg gene into pGEM, it was
amplified by PCR.

The tbg gene was isolated from a preparation of Thermus aquaticus genomic DNA.
Primer sequences used for the PCR amplification of the tbg gene to construct pGEMtbg
included primer 187, which contained PmlI and BstEII restriction sites followed by the trp
transcriptional terminator (underlined), followed by BclI, SnaBI, NheI, FseI and AvrII
restriction sites, followed by a sequence homologous to the 5' end of the tbg gene (bold)
starting with a putative ATG site as depicted below. Primer 227 contained sequence
homologous to the 3' end of the tbg gene.

187     5'-CACGTGGTTA CCCGCCTAAT GAGCGGGCTT TTTTTTGATC
        ATACGTAGCT AGCCCCGGCC GGCCTAGGAT GGCAATTATT
        CAATTTC-3' (SEQ ID NO: 1)

227     5'-TTAATATTCA AACCATTTAT TTTCTAT-3' (SEQ ID NO: 2)
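
The layout of primer 187 described above can be checked with a short script. The following is a minimal sketch, not part of the original disclosure: the recognition sequences in `SITES` are the standard ones for the named enzymes, and the function name `find_sites` is a hypothetical helper introduced here for illustration.

```python
import re

# Standard recognition sequences (5'->3') for the enzymes named in the text.
# "N" stands for any base.
SITES = {
    "PmlI": "CACGTG",
    "BstEII": "GGTNACC",
    "BclI": "TGATCA",
    "SnaBI": "TACGTA",
    "NheI": "GCTAGC",
    "FseI": "GGCCGGCC",
    "AvrII": "CCTAGG",
}

# Primer 187 (SEQ ID NO: 1), 87 nt.
PRIMER_187 = (
    "CACGTGGTTACCCGCCTAATGAGCGGGCTTTTTTTTGATC"
    "ATACGTAGCTAGCCCCGGCCGGCCTAGGATGGCAATTATTCAATTTC"
)


def find_sites(seq, sites=SITES):
    """Return sorted (0-based position, enzyme) pairs for the first hit of each site."""
    seq = seq.upper()
    hits = []
    for name, site in sites.items():
        match = re.search(site.replace("N", "[ACGT]"), seq)
        if match:
            hits.append((match.start(), name))
    return sorted(hits)


if __name__ == "__main__":
    for position, enzyme in find_sites(PRIMER_187):
        print(f"{enzyme:7s} starts at base {position + 1}")
```

Sorting the hits by position reproduces the order given in the text: PmlI, BstEII, then (after the terminator) BclI, SnaBI, NheI, FseI and AvrII, immediately followed by the ATG.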

The 5' end primer was designed so that tbg, upstream of the Shine-Dalgarno (SD) site,
had unique SnaBI and BclI sites for cloning blunt-ended DNA fragments and fragments
obtained as a result of partial digestion with Sau3AI. The strong transcription terminator of
the E. coli trp operon was included upstream of the cloning sites. The PCR fragment was
subcloned into the pGEM-T vector (obtained from Promega) to generate pGEMtbg using
standard techniques. pGEM-T allows direct cloning of PCR products without the need for
restriction digestion. Clones with the proper orientation of the gene were determined by
restriction analysis.

Plasmid pVUF10tbg (Fig. 1) was constructed by inserting the kanamycin drug
resistance marker kantr2 (Weber, et al. (1995) A chromosome integration system for stable
gene transfer into Thermus flavus. Bio/Technology, Vol. 13(3): 271-275) into the NdeI to
NotI site of pGEMtbg. The fragment containing the kanamycin gene was prepared by
amplification with primers 388 and 442 listed below. The fragment was digested with NotI
and NdeI and cloned into pGEMtbg which had also been digested with NdeI and NotI.

442     Containing the NdeI (bold), BstEII, and PmlI sites followed by homology to
        the kanamycin gene:

        5'-TGGTTACCAT ATGGTAACCA CGTGAATGGA CCAATAATAA TG-3'
        (SEQ ID NO: 3)

388     Containing the NotI site (bold) followed by the rrnC transcriptional terminator
        (underlined) and homology to the kanamycin gene:

        5'-GTTATCTGAA AGCGGCCGCT TTCAGATAAA AAAAATCCTT AGCTTTCGCT
        AAGGATGGAT TTCTGGCTCA AAATGGTATG GTTTTGAC-3'
        (SEQ ID NO: 4)

The resultant PCR fragment was then isolated and inserted into pGEMtbg plasmid
using standard molecular biology techniques.

B. Construction and screening of a T. thermophilus genomic library in E. coli

To construct a Thermus promoter library, T. thermophilus chromosomal DNA was
partially digested with Sau3AI and a fraction of the digested fragments of about 1 kb in size
was purified by elution from an agarose gel. These fragments were then ligated with DNA from
the pVUF10 cloning vector which had been digested with BclI for cloning. T. thermophilus
was used as the source of genomic DNA since the T. thermophilus and T. flavus strains
utilized are highly related, but not identical. The ligated DNA was used to transform E. coli.
Transformed cells were plated on LB agar containing X-Glc at 50 µg/ml and cultured for
several days at 37°C. Recombinant clones exhibiting promoter activity (approximately one
percent of all recombinant clones) developed different shades of blue color due to tbg. Out
of several hundred blue colonies, 24 were randomly selected for further analysis.

The nucleotide sequence of each insert was determined. The entire insert was
sequenced for clones W12, W18, W51 and W57 and about 50% of the sequence was
determined for the remaining clones. The sequences of clones 1 and 2 overlapped and were
combined, resulting in the sequence designated W1-2. Computer analysis of these sequences
using the BLASTN search algorithm revealed putative core promoter regions showing similarity
to the consensus promoter sequence of E. coli (Table 3). Sequence analysis of promoter
W1-2 revealed two inverted repeats, one of which was at the transcription-start site. These
sequences were AT-rich (below 40% GC) compared with the GC-rich content of random
Thermus DNA, which is about 72% GC. This preliminary search was performed using
Sequencher DNA assembler (commercially available). One of these potential promoters
matched the known promoter for T. thermophilus chaperonin (Figure 3).
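
The GC comparison above (below 40% GC versus about 72% GC for random Thermus DNA) can be reproduced for any sequence with a few lines of code. The following is a minimal sketch; the function names `gc_percent` and `gc_windows` and the window size are illustrative assumptions, not tools named in the source.

```python
def gc_percent(seq):
    """Percentage of G+C among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def gc_windows(seq, size):
    """GC percentage of every overlapping window of the given size."""
    return [gc_percent(seq[i:i + size]) for i in range(len(seq) - size + 1)]


if __name__ == "__main__":
    # Toy input; a real insert sequence would be substituted here.
    example = "TTATAATGGCCGGCC"
    print(round(gc_percent(example), 1))
    print([round(v) for v in gc_windows(example, 6)])
```

A sliding window is used because an AT-rich core promoter embedded in GC-rich flanking DNA shows up as a local dip rather than as a change in the overall average.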

Table 3

Promoters identified from homology search

Sequence                                                                    Clone   SEQ ID NO

TTGACATTCCCCCCGCCCCGGGGTACCCTCCTTCCCGGGAGGCGCGCCTCCCGAGGAGAACGGTACCCATG...  W1-2    5

TTGACAAGGGAAAGCCGGGGTGCTAACTTAGGGATTGCGCTGCCCT...                           W57     6
  ...ATACGTAGCTAGCCCCGGCCGGCCTAGGATG...

TTTATTCGCAAAGCCCCCCGGTGCTATAATGGAAGACGGCGTCTAAACGCCTTCTAGGACCCCTATG...      W34     7

TTGACGCTCCCCCAAAAGCCCCCTTATAATCGCTGTGGAATAGCTTCCAAAGGAGGTACGGTATG...        W40     8

TTCTAGAGGCGGCGCTCCGCCTCTATCGCCACCCGGATCATTTACCCCCTCATCAAGGCCACC...          W37     9

TTGACAAAGGCCATGCCTCCTTGGTATCTTCCCTTTTGCGCTGCCCTGAGGGGG...                   W53     10

TTGACAAGGTCTTCCGCCAGGCCTCCATCCACCACGTCATCGTCCTGGAG...                       W18     11

TTCGAATCCCTCCGGGCCCGCCATTGTTATCTTGGAAATGGGTAGCCTTT...                       W51     12


ATG start codons shown in bold font are verified real start codons as identified by
"CodonUse", which looks for codon usage patterns in open reading frames. "Clone" denotes
the promoter clone name. Putative SD, translation start, -35 and -10 sites are shown in bold.
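
The comparison with the E. coli consensus can be illustrated by scanning a sequence for a -35 box and a -10 box separated by a typical spacer. The following is a minimal sketch under stated assumptions: the consensus hexamers TTGACA and TATAAT are the standard E. coli sigma-70 boxes, while the spacer range (15 to 19 bases), the mismatch limit, and the function names are illustrative choices, not the search procedure used in the source.

```python
def mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def scan_promoter(seq, box35="TTGACA", box10="TATAAT",
                  spacers=range(15, 20), max_mm=3):
    """List candidate (-35, -10) pairs as
    (total mismatches, -35 start, spacer length, -35 hexamer, -10 hexamer),
    best candidates first."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 6 + 1):
        mm35 = mismatches(seq[i:i + 6], box35)
        for spacer in spacers:
            j = i + 6 + spacer          # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                break
            mm10 = mismatches(seq[j:j + 6], box10)
            if mm35 + mm10 <= max_mm:
                hits.append((mm35 + mm10, i, spacer,
                             seq[i:i + 6], seq[j:j + 6]))
    return sorted(hits)


# W40 promoter sequence as given in SEQ ID NO: 8.
W40 = "TTGACGCTCCCCCAAAAGCCCCCTTATAATCGCTGTGGAATAGCTTCCAAAGGAGGTACGGTATG"

if __name__ == "__main__":
    print(scan_promoter(W40)[0])
```

For the W40 sequence the best candidate pairs TTGACG (one mismatch to TTGACA) with an exact TATAAT box 18 bases downstream.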
To determine the promoter activity in each of the clones, a Tbg assay was performed, 
as shown in Table 4 below. As shown therein, expression from the promoters varies. 

Table 4 

Expression characteristics of the cloned promoter candidates 



Promoter   E. coli            Ratio*   Homologies & Comments
Clone      Average Units

W1         [illegible]        1.1
W2         36 ± 24            1.6      none
W4         152 ± 23           1.4      Thermus NADH dh, DNA pol
W7         120 ± 39           1.2      none
W12        132 ± 23           1.4
W13        350 ± 13           2.1      glutamate synt, Thermus dh, def. fmt.
W15        150 ± 8            1.7      proC
W18        97 ± 2             1.3
WHCl       287 ± 16           1.6
W31        5836 ± 511         1.6
W32        67 ± 15            1.4      B. sp wap. lie J
W33        87 ± 20            1.0
W34        66 ± 21            1.5      23S rRNA
W35        81 ± 16            1.3      fus?
W36        40 ± 12            1.3      Ile-tRNA synthetase
W37        62 ± 8             1.4      Thermus sip, nox, pol, Zea rbd inac
W38        59 ± 9             1.6      Thermus lysyl tRNA synthetase
W39        65 ± 6             1.4      ribosomal spacer
W40        118 ± 10           1.2      Mus musculus transcription factor
W47        95 ± 24            1.1
W51        210 ± 4            1.4
W53        149 ± 33           2.0
W57        56 ± 27            2.1      T. thermophilus chaperonin
W70        134 ± 11           1.1      T. thermophilus ribonuclease H
W1.2       2000 ± 242         nd       Tbg with promoter

Assays were performed as described for β-galactosidase (Miller, ... in bacterial genetics.
Cold Spring Harbor Laboratory Press, Cold Spring Harbor)
using o-nitrophenyl galactopyranoside (ONPG) as a substrate, modified to be run at 65°C to
assay for Tbg. *Ratio of expression from E. coli host grown at 42°C divided by E. coli host
grown at 30°C.
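
For orientation, the classical Miller calculation for an ONPG assay and the 42°C/30°C ratio defined in the footnote can be written out as follows. This is a sketch only: the source does not state how its "Average Units" were computed, so treating them as standard Miller units is an assumption, and the function names and example readings are hypothetical.

```python
def miller_units(od420, od550, od600, minutes, volume_ml):
    """Standard Miller units: 1000 * (OD420 - 1.75 * OD550) / (t * v * OD600).

    od420/od550 are read on the reaction, od600 on the culture,
    minutes is the reaction time and volume_ml the culture volume assayed.
    """
    if minutes <= 0 or volume_ml <= 0 or od600 <= 0:
        raise ValueError("time, volume and OD600 must be positive")
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


def temperature_ratio(units_42c, units_30c):
    """Expression at 42 degrees C divided by expression at 30 degrees C."""
    return units_42c / units_30c


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    print(miller_units(od420=0.5, od550=0.0, od600=0.5,
                       minutes=10, volume_ml=0.1))
    print(temperature_ratio(210.0, 100.0))
```

A ratio near 1 indicates no temperature dependence in E. coli, while a ratio near 2 corresponds to the doubling discussed for some clones.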

The promoters were then assayed at higher temperatures in E. coli. Although it was
unlikely (but possible) that a temperature-sensitive repressor had been cloned along with a
promoter, temperature dependence of a promoter could potentially be observed because
DNA from Thermus is typically GC rich. In addition, it was also possible that the promoter
would cross-react with E. coli regulatory elements. Three of the promoter clone candidates
showed twice the level of expression at 42°C. Clone W57 appears to have homology to a
Thermus heat shock protein (actually a chaperonin), as shown in Fig. 3.

Example 2

Integrative promoter-identification vector for Thermus
To evaluate the strength of the identified core promoter regions in Thermus, a novel
integrative promoter test vector was constructed for use in T. flavus. The vector was
constructed to include a thermostable kanamycin resistance gene (KmR), which had been
previously demonstrated to function in Thermus. The vector integrates by a double-
crossover event so the insertion is stable and permanent. The integrative vector pTG100kan
for use in T. flavus is a suicide vector having the KmR gene as a selective marker, and leuB
as a region of homology where integration into the chromosome occurs (Weber, et al. (1995)
A chromosome integration system for stable gene transfer into Thermus flavus.
Bio/Technology, Vol. 13(3): 271-275). Other such reporter genes and insertion sites could be
used as well.

A promoter-test vector was redesigned from pTG100kan (Weber, et al. (1995) A
chromosome integration system for stable gene transfer into Thermus flavus. Bio/Technology,
Vol. 13(3): 271-275). In a novel vector, pTG200, the promoterless KmR gene was utilized as
a reporter gene (Figure 3B). In this vector, KmR is oriented in the opposite direction to leuB.
Therefore, upon integration of a fragment bearing a promoter in front of KmR into the T.
flavus chromosome, simultaneous transcription from the leu and KmR promoters might cause
transcriptional occlusion, a phenomenon observed when transcription through a
promoter inhibits that promoter's function. To avoid such interference, a Thermus
transcriptional terminator was inserted downstream of the KmR gene. A consensus sequence
of Thermus transcriptional terminators was derived by analysis of a number of Thermus
terminators as shown in Figure 3A. In this example, the sequence below was used as a
terminator (underlined sequences signify the regions of inverse homology):

TGCCACCCCATGCTGGCTTGC GCCAGCATGGGG GCCCCGGCAAAAGAATTC
(SEQ ID NO: 13)
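
The regions of inverse homology in this terminator can be located computationally by searching for the longest stem whose two arms are reverse complements of each other. The following is a minimal sketch; the function name `longest_hairpin` and the allowed loop sizes (3 to 8 bases) are assumptions made for illustration, and only Watson-Crick pairs are counted.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Terminator sequence (SEQ ID NO: 13), taken from the TR5KAN primer listing.
TERMINATOR = "TGCCACCCCATGCTGGCTTGCGCCAGCATGGGGGCCCCGGCAAAAGAATTC"


def longest_hairpin(seq, min_loop=3, max_loop=8):
    """Return (stem length, left arm, loop, right arm) of the longest perfect hairpin."""
    seq = seq.upper()
    n = len(seq)
    best = (0, "", "", "")
    for a in range(n):                       # a: innermost base of the left arm
        for loop in range(min_loop, max_loop + 1):
            b = a + loop + 1                 # b: innermost base of the right arm
            k = 0
            # Extend the stem outward while the bases pair.
            while a - k >= 0 and b + k < n and PAIR.get(seq[a - k]) == seq[b + k]:
                k += 1
            if k > best[0]:
                best = (k, seq[a - k + 1:a + 1], seq[a + 1:b], seq[b:b + k])
    return best


if __name__ == "__main__":
    stem, left, loop, right = longest_hairpin(TERMINATOR)
    print(stem, left, loop, right)
```

For SEQ ID NO: 13 this finds a 12 base-pair stem (CCCCATGCTGGC paired with GCCAGCATGGGG) closed by the four-base loop TTGC.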

With the promoterless KmR gene positioned in the opposite direction to leuB, pTG200 did not
confer KmR on T. flavus cells and, therefore, could be used as a promoter probe vector.

The terminator sequence was derived by a comparison of the terminators shown in
Figure 3A, using the his terminator as a model. A terminator sequence was obtained as
part of a larger fragment amplified by PCR using the following primers:

TR5KAN - Containing NcoI site, transcriptional terminator (underlined), EcoRI site, and
5' end sequence of the KmR gene:

5'-acacacacaca CCATGG cctaa TGCCACCCCATGCTGGCTTGCGCCAGCATGGGGGCCCCGGCAAAA
GAATTC aaagggaatgagaatagtgaatggacc-3' (SEQ ID NO: 14)
Segments, 5' to 3': NcoI site (CCATGG); terminator (TGCCA...GCAAAA); EcoRI site
(GAATTC); 5' end of KmR gene (lowercase, following EcoRI).

KAN3 - Containing PstI and HindIII sites and 3' portion of the KmR gene:

5'-cagcatggcc CTGCAG AAGCTT caaaatggtatgcgttttgacacatcca-3' (SEQ ID NO: 15)
Segments, 5' to 3': PstI site (CTGCAG); HindIII site (AAGCTT); 3' end of KmR gene
(lowercase, following HindIII).

A DNA fragment obtained by PCR amplification of the KmR gene from pTG100 was
digested with PstI and NcoI and subcloned into pTG100 cleaved with NcoI and NsiI.

It turned out that the terminator diminished transcription but did not terminate it
completely, so that we could observe weak growth on Km of T. flavus transformed with the
plasmid described above. To avoid this effect, we inverted the kanamycin gene in the
construct. To invert the gene, we amplified it by PCR using primers 3KM-RI and 5KM-H3
to obtain a DNA fragment bearing the promoterless KmR gene:

5'-acacacGAATTCcaaaatggtatgcgttttgacacatcc-3' (SEQ ID NO: 16)
         EcoRI

5'-cacacacaAAGCTTtacgtatctagagggaatgagaatagtgaatggacc-3' (SEQ ID NO: 17)
           HindIII

The fragment was cleaved with EcoRI and HindIII and subcloned into the plasmid
described above, cleaved by the same enzymes, to give pTG200. The resulting plasmid has a
promoterless KmR gene with a unique HindIII site upstream and a terminator downstream.
We did not observe growth of Thermus cells transformed with pTG200 on Km plates. Hence,
the plasmid can be used as a promoter probe vector in Thermus.

To check activity of the promoters identified in E. coli, the core regions attached to
the kanamycin resistance gene were amplified by PCR using primer 3KM-RI and one of the
following primers:

W37KM 

acacacAAGC Ttgtagaggc ggcgctccgc ctctatggcc acccggatca tttaccccct catcaaggag gagaatagtg 
Aatggaccaa taatgac (SEQ ID NO: 18) 



W53KM

acacacacAA GCTTgacaaa ggccatgcct ccttggtatc ttcccttttg cgctgccctg aggaggagaa tagtgaatgg
accaataata atgact (SEQ ID NO: 19)

W18KM

acacacacAA GCTTgacaag gtcttccgcc aggcctccat ccaccacgtc atcgtcctgg aggaggagaa tagtgaatgg
accaataata atgact (SEQ ID NO: 20)

W51KM

acacacAAGC Ttcgaatccc tccgggcccg ccattgttat cttggaaatg ggtagccttt aggaggagaa tagtgaatgg 
accaataata atgact (SEQ ID NO: 21) 

W57KM

acacacacAA GCTTgacaagg gaaagccggg gtgctaactt agggattgcg ctgccctcat acgtaggagg
agaatagtga atggaccaat aataatgac (SEQ ID NO: 22)

W12D2 PRIMER

acacacacAA GCTTgacatt ccccccgccc cgccgtaccct ccttcccggg aggaggagaa tagtgaatgg
accaataata atgactag (SEQ ID NO: 23)

All PCR reactions were performed in a volume of 100 microliters. The reaction
mixture contained 50 mM KCl, 10 mM Tris-Cl pH 8.3, 1.5 mM MgCl2, 0.2 mM dNTPs (A, C,
G, T), 2 U Taq DNA polymerase (Perkin-Elmer), 40 pmole of each primer, and 100 ng of
template DNA. The thermal cycler repeated the following steps for 30 cycles: 1 minute at
94°C, 1 minute at 55°C, 1 minute at 72°C. PCR fragments were cleaved by HindIII and
EcoRI and subcloned into pTG200 cleaved by the same enzymes. DNA of the resulting
plasmids was used to transform T. flavus. Transformed cells were plated on LB agar
containing 20 µg/ml Km.
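
The reaction conditions above translate directly into per-reaction amounts and a total cycling time. The following is a small worked sketch using only the figures stated in the text; the dictionary and function names are illustrative, and ramp times between steps are ignored.

```python
REACTION_UL = 100.0          # reaction volume in microliters

# Final concentrations in mM, as stated in the text.
FINAL_MM = {
    "KCl": 50.0,
    "Tris-Cl pH 8.3": 10.0,
    "MgCl2": 1.5,
    "dNTPs": 0.2,
}

PRIMER_PMOL = 40.0           # pmol of each primer per reaction


def nmol_in_reaction(final_mm=FINAL_MM, volume_ul=REACTION_UL):
    """Amount of each component in nmol (1 mM equals 1 nmol per microliter)."""
    return {name: conc * volume_ul for name, conc in final_mm.items()}


def primer_micromolar(pmol=PRIMER_PMOL, volume_ul=REACTION_UL):
    """Final primer concentration in micromolar (pmol per microliter)."""
    return pmol / volume_ul


def cycling_minutes(cycles=30, step_minutes=(1, 1, 1)):
    """Total time spent in the denaturation, annealing and extension steps."""
    return cycles * sum(step_minutes)


if __name__ == "__main__":
    print(nmol_in_reaction())
    print(primer_micromolar(), "uM of each primer")
    print(cycling_minutes(), "minutes of cycling")
```

With the stated values, each primer is present at 0.4 µM and the 30 cycles account for 90 minutes.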
Promoter W1-2 was also modified by removal of the larger or both inverted repeats
because it appeared that this inverted repeat might affect ribosome binding. Putative core
promoters were placed immediately upstream of the KmR gene and integrated into the T.
flavus chromosome. Promoters W1-2/D2, W40, W53, and W57 proved functional in T.
flavus, conferring Km resistance on the cells upon transformation at 20 µg/ml. It is possible
that the other promoters may confer resistance below this level; however, as the level of
kanamycin drops to about 10 µg/ml, background growth of Thermus begins to occur. The
unmodified W1-2, without the inverted repeat removed, did not give expression when tested,
in contrast to W1-2/D2, which gave the strongest expression.

W1-2      TTGACATTCCCCCCGCCCCGGGGTACCCTCCTTCCCGGGAGGCGCGCCTCCCGAGGAGAA
          (SEQ ID NO: 24)

W1-2/D2   TTGACATTCCCCCCGCCCCGGGGTACCCTCCTTCCCGGGAGGAGGAGAA (SEQ ID NO: 25)

Figure. Removal of hairpin loop region in W1-2 promoter. The -35, -10 and SD
regions are shown in bold. The inverted repeat is underlined.

To confirm that integration of the KmR gene into the T. flavus chromosome had occurred
in the strains from which transformants were obtained, Southern hybridization and PCR
analysis of the promoters were performed. The data indicate that integration of KmR into
leuB had occurred. To estimate activity of the promoters, clones were cultured on TT agar
plates containing various amounts of Km. The data indicated that the modified W1-2 was the
most active of the tested promoters, conferring on the cells resistance of up to 1000 µg/ml Km
(Table 5). While high levels of kanamycin resistance had been reported on multicopy
plasmids in other organisms, stable resistance to kanamycin at these levels had not been
previously observed in Thermus.

Table 5

Ability of the tested promoters to confer kanamycin resistance on a promoterless kantr2

Example 3

Integrative promoter-identification vector for Thermus

Promoters are selected by direct cloning of libraries into a Thermus strain as well.
This method avoids the initial characterization of promoter activity in E. coli. Construction
of a Thermus promoter library for direct transformation in Thermus is carried out utilizing a
screening or selection marker that functions in Thermus. This marker is incorporated into a
promoter probe vector capable of either integrating into the Thermus chromosome or being
maintained extrachromosomally on a suitable plasmid in Thermus. The
promoter-test vector pTG200 (described above, Example 2) is one such vector. In this case,
the promoterless KmR gene is utilized as a reporter gene (Figure 3B).

In order to use the promoter probe vector, Thermus chromosomal DNA is partially
digested with a frequent-cutting endonuclease (such as Sau3AI) and a fraction of 1 kb
fragments is purified by elution from an agarose gel. These fragments are then ligated with
DNA from the promoter-test pTG200 vector which has been digested with an appropriate
enzyme (BclI) for cloning. The ligated DNA is then used to transform Thermus. The plating
medium contains a suitable amount of the selection (or screening) agent, such as kanamycin in
this example, which is determined by testing a series of concentrations and choosing one just
strong enough to prohibit growth of untransformed strains. This allows selection or screening
for promoter sequences which activate the reporter gene.
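
The rule for choosing the selective concentration (the lowest tested level at which untransformed cells no longer grow) can be stated as a short function. The following is a minimal sketch; the function name `lowest_selective_dose` and the colony counts are hypothetical and serve only to illustrate the selection rule.

```python
def lowest_selective_dose(control_growth):
    """Pick the lowest tested concentration with no background growth.

    control_growth maps concentration (ug/ml) to colony counts of the
    untransformed control strain. A concentration qualifies only if it and
    every higher tested concentration show zero colonies. Returns None when
    no tested concentration is selective.
    """
    for dose in sorted(control_growth):
        if all(count == 0 for d, count in control_growth.items() if d >= dose):
            return dose
    return None


if __name__ == "__main__":
    # Hypothetical plate counts for an untransformed control strain.
    counts = {5: 200, 10: 12, 20: 0, 40: 0}
    print(lowest_selective_dose(counts))
```

Requiring that all higher concentrations are also clean guards against picking a level that only looked selective because of a single failed plate.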

To confirm that integration of the KmR gene into the T. flavus chromosome has occurred
in the strains from which the transformants are obtained, Southern hybridization and PCR
analysis of the promoters are performed. The nucleotide sequence of each putative promoter
is then determined by sequencing the Thermus genomic fragment. Computer
analysis of these sequences using the BLASTN search algorithm is then utilized to reveal
putative core promoter regions. The promoter strength of the discovered promoters is then
analyzed using the methods described in Example 2 using either the same or a different
reporter gene.

Example 4 
Putative promoters from T. flavus 

To date, approximately 300 kb comprising about 20% of T. flavus genome has been 
sequenced by the applicants. A search for putative promoters within these sequences has 
been accomplished using Neural Network for Promoter Prediction software, NNPP. NNPP is 
a time-delay neural network consisting mostly of two feature layers, one for recognizing 
TATA-boxes and one for recognizing so called "initiators", which are regions spanning the 
transcription start site. Both feature layers are combined into one output unit. 

Nineteen putative promoters were identified in the 25 kb contig by NNPP (Table 6).
Although NNPP bases its search on the -10 and +1 regions, the -35 boxes found for
most of the identified promoters match the consensus sequence TTGNCN derived from
published data and from data obtained in the experiments described herein.
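
A degenerate consensus such as TTGNCN can be matched against candidate sequences by expanding its IUPAC codes into a regular expression. The following is a minimal sketch; the helper names are illustrative, and the example input is the first 29 bases of SEQ ID NO: 28 as printed in Table 6.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "N": "[ACGT]",
}


def consensus_pattern(consensus):
    """Translate a degenerate consensus (e.g. TTGNCN) into a regex string."""
    return "".join(IUPAC[code] for code in consensus.upper())


def find_boxes(seq, consensus="TTGNCN"):
    """Return (0-based position, matched hexamer) for every match, overlaps included."""
    # A lookahead is used so that overlapping matches are all reported.
    pattern = re.compile("(?=(%s))" % consensus_pattern(consensus))
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    fragment = "GAGTACGTTGACCAGCGCTCCCCGAAAGG"   # start of SEQ ID NO: 28
    print(find_boxes(fragment))
```

Here the fragment contains a single match, TTGACC, beginning at the eighth base.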

Table 6 

Putative Promoter Sequences Predicted for T.flavus contig TF4-6-10.1 



-35 and -10 regions are marked in the original table.

ATGGCATTOTCTTTCCGCTATTGAATGACTATCATTCAAGTATGGAAAGA (SEQ ID NO: 26)
GAGGTTGCTTGGTTCCGGTGCACGAGTTCTATTCTGCCCAGGCCGTAGCG (SEQ ID NO: 27)
GAGTACGTTGACCAGCGCTCCCCGAAAGGTAIAAQCGGGCACGTAAAGCC (SEQ ID NO: 28)
RCC TTGTCG TGCCTCGCCTTGAGGTAGAGGAACATGGCGTAGGGCTCCTG (SEQ ID NO: 29)
CTAATCTOGAAGTAGGCCGGGTTCTTGGCGATQMCTCCCACACCAGCAC (SEQ ID NO: 30)
GAC TTGCAQA ARCTTTTGGTAACCTGCCATAGCTTCTACCCTCCTCGTTC (SEQ ID NO: 31)
TTTGTTAAAGGAAGCGAGCTITCCTCGCACATAACTCACCAGATTCAAAT (SEQ ID NO: 32)
GCTCGCTTCCTTTAACAAAGGTGATCCGGTACTMAAAATCTGCAAGAGG (SEQ ID NO: 33)
AACACGCATCTGATTGGCAGACCTTTTTCCAGAATAjraGTTGAAGACCGT (SEQ ID NO: 34)
TATGACCGTGGATGAAGTCAGTACCTGGCCGCGGTCTTATGGGCACCTGG (SEQ ID NO: 35)
GCAATCAGAATGTCAAGCAAAAATTGGAGTCGCTCAAAATCCCCGACTCC (SEQ ID NO: 36)
CAGGTCTAGTTTGGCGACGCGAGGCTCAAGGGAATACCGTCCCGGACCGC (SEQ ID NO: 37)
TTGGTTG^TCTTCGGCCAGAAAAGGGAAATAATCCCAGGTCATGCGCC (SEQ ID NO: 38)
AACTGGTTTGAG^CGGCGCTTCATCTCGTCAAAGTCCACCAATCCCGGCT (SEQ ID NO: 39)
GAAGTTTTGTAGCGAGACCCAAGAGAAATCATGMATGAGTGTGGTACTT (SEQ ID NO: 40)
GGGAGGCCATCTTGTCTGGATTGTAGCACTTCCCTATCCTTAGCCCAAGG (SEQ ID NO: 41)
GTGCGCCTATTTTGAGTTCTGCTTCGTGGAGGAGGAAGATGGCTAAGCCG (SEQ ID NO: 42)
ACCCCGGGGGGTTGA^GCACACCCCCCGATCTGCTAACTJGGCCTTAAGT (SEQ ID NO: 43)
GACCAACAGCCATTG^CGCAAAGTACCACACTCATATCATGATTTCTCTT (SEQ ID NO: 44)

The putative promoter sequences of SEQ ID NO: 26-44 are amplified by PCR using
primers containing restriction sites compatible with the restriction sites available for cloning
3' of the kanamycin sequence of the promoter-test vector pTG200. These fragments are then
ligated with DNA from the promoter-test pTG200 vector which has been digested with an
appropriate enzyme for cloning. The ligated DNA is then used to transform Thermus. The
plating medium contains a suitable amount of the selection (or screening) agent, such as
kanamycin in this example, which is determined by testing a series of concentrations and
choosing one just strong enough to prohibit growth of untransformed strains. This allows
selection or screening for promoter sequences which activate the reporter gene. To confirm
that integration of the KmR gene into the T. flavus chromosome has occurred in the strains
from which the transformants are obtained, Southern hybridization and PCR analysis of the
promoters may be performed. Computer analysis of these sequences using the BLASTN search
algorithm may then be utilized to reveal putative core promoter regions. The promoter
strength of the discovered promoters may then be analyzed using the methods described in
Example 2 using either the same or a different reporter gene.



WO 01/18217                                                    PCT/US00/24430

Example 5

Promoter Optimization for Gene Expression in a Thermophile

To optimize the promoters found to be useful in driving gene expression in a
thermophilic organism, promoter deletion constructs are generated.

A. Promoter Optimization by PCR

One method to generate sequential deletions is by PCR amplification. A 1 kb 
thermophile genomic fragment that is observed to drive gene expression in a thermophilic 
organism is modified to generate subfragments at 100 nucleotide intervals by PCR. PCR 
primers are designed that correspond to regions of the 1 kb fragment as follows: bp 1-900, bp 
1-800, bp 1-700, bp 1-600, bp 1-500, bp 1-400, bp 1-300, bp 1-200, and bp 1-100. The 
primers also incorporate restriction enzyme sites suitable for insertion into pTG200 reporter 
vector. 
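
The nested subfragments listed above (bp 1-900 down to bp 1-100) can be enumerated programmatically, together with a candidate reverse primer at the end of each subfragment. The following is a minimal sketch: the function names and the 20-base primer length are assumptions for illustration, and the restriction-site tails mentioned in the text are not added.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def deletion_series(fragment_len=1000, step=100):
    """1-based (start, end) coordinates of the nested subfragments."""
    return [(1, end) for end in range(fragment_len - step, 0, -step)]


def reverse_primer(fragment, end, length=20):
    """Reverse primer annealing to the last `length` bases of bp 1..end."""
    return reverse_complement(fragment[max(0, end - length):end])


if __name__ == "__main__":
    for start, end in deletion_series():
        print(f"bp {start}-{end}")
```

All subfragments share the same forward primer at bp 1, so only the reverse primer changes from one construct to the next.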

B. Promoter Optimization using an Exonuclease

Exonucleases such as Exo III or Bal31 may also be used to generate sequential
deletions of the promoter regions. These enzymes are employed using standard molecular
biology techniques or as described by the supplier of these enzymes (New England Biolabs) to
generate a series of random deletions by reacting with a linearized plasmid or fragment of
DNA. The exonucleases generate a time-dependent set of deletions from one or both ends of
the linear DNA. In the case of Bal31, deletions at both ends of the linear fragment that has
been treated are obtained and the deleted promoter sequence is subcloned into an appropriate
test plasmid such as pTG200. In order to do this, the deletion ends are repaired with Mung
Bean Nuclease so that they are suitable for subcloning, digested with an appropriate
restriction endonuclease which can be used to isolate the remaining promoter fragment, and
religated into the test plasmid. In the case of Exo III, it is possible to obtain deletions at only
one end (for example the end of the linearized plasmid containing the promoter sequence) if
an appropriate set of restriction endonucleases is used. Several kits are available to do this
(such as the Exo-Size™ kit from New England Biolabs). Exo III prefers digestion of DNA
which contains a blunt or 3' recessed end, so if two appropriate endonucleases can be used to
linearize the plasmid, then after repairing the deleted ends with Mung Bean Nuclease to make
them suitable for cloning, and optionally adding a linker DNA, the DNA can be directly
religated and used for testing without the need for subcloning. The DNA from either method
can be transformed into E. coli and a series of plasmids containing a series of deletion
constructs from each end of the putative promoter can be identified.

C. Testing the Constructs 

Following promoter modification as described above, constructs containing one or
more of the various subfragments are made using standard molecular biology techniques such
that each subfragment has the potential to drive expression of the kanamycin resistance gene
following transformation of the thermophile with the ligated reporter vector. The
thermophilic organism is then assayed for kanamycin resistance by plating on kanamycin-
containing plates. Growth in the presence of increasing amounts of kanamycin indicates that
the subfragment is active in the thermophilic organism.

Following identification of those subfragments having activity in the thermophile, the
fragments may be combined by ligation or PCR amplification and co-inserted into pTG200 or
another reporter vector. The ability of these combined subfragments to drive gene expression
in the thermophile is then determined by the presence or absence of growth in the presence of
kanamycin. In this manner, it is possible to identify those fragments that function additively
or synergistically to drive high expression in the thermophile.

1. An isolated, recombinant DNA molecule for identification of a regulatory region of a
thermophile genome comprising:

a) a reporter sequence; 

b) a putative thermophile promoter operably linked to said reporter sequence to form 
a promoter / reporter cassette; 

c) a drug resistance marker; and,

d) a 3' and a 5' DNA targeting sequence that are together capable of causing 
integration of at least a portion of said DNA molecule into the genome of a 
thermophile; 

wherein said promoter/reporter cassette is flanked by said 3' and said 5' DNA targeting
sequences; and, said promoter/reporter cassette is positioned in the opposite
orientation of the DNA targeting sequences.

2. A recombinant DNA of claim 1 wherein said reporter sequence is tbg.

3. A recombinant DNA of claim 1 wherein said reporter sequence is lacZ.

4. A recombinant DNA of claim 1 wherein said drug resistance marker confers upon a
thermophile resistance to kanamycin.

5. A recombinant DNA of claim 1 wherein said putative promoter sequence is a
fragment of the genome of a thermophile.

6. A recombinant DNA of claim 5 wherein said fragment was isolated following limited
digestion of the genome with a restriction enzyme.

7. A recombinant DNA of claim 1 wherein said thermophile putative promoter sequence
was isolated from a bacterium of the genus Thermus.

8. A recombinant DNA of claim 1 wherein said thermophile putative promoter sequence
was isolated from the thermophile Thermus flavus.

9. A recombinant DNA of claim 1 wherein said thermophile putative promoter sequence
was isolated from the thermophile Thermus thermophilus.

10. A method of identifying a thermophile promoter comprising transforming a
thermophile with a recombinant DNA molecule of claim 1 and detecting expression of
the reporter sequence.

11. A method of claim 10 wherein said thermophile is from the genus Thermus.

12. A method of claim 10 wherein said thermophile is Thermus flavus.

13. A method of claim 10 wherein said thermophile is Thermus thermophilus.

14. An isolated DNA molecule comprising a promoter sequence identified by the method
of claim 10.

15. An isolated DNA molecule comprising a thermophile promoter sequence selected
from the group consisting of SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID
NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID

SEQUENCE LISTING

<110> Peredeltchouk, Mikhail 
Vonstein, Veronika 
Demirjian, David 

<120> Thermus Promoters for Gene Expression 

<130> 99-559 

<140> 09/390,867 
<141> 1999-09-07 

<160> 44 

<170> PatentIn Ver. 2.0

<210> 1 
<211> 87 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR primer 187 
<400> 1 

cacgtggtta cccgcctaat gagcgggctt ttttttgatc atacgtagct agccccggcc 60 
ggcctaggat ggcaattatt caatttc 87 

<210> 2 
<211> 27 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR primer 227 
<400> 2 

ttaatattca aaccatttat tttctat 27

<210> 3 
<211> 42 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR primer 442 
<400> 3 

tggttaccat atggtaacca cgtgaatgga ccaataataa tg 42

<210> 4 

<211> 88 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR primer 388 

<400> 4 
gttatctgaa agcggccgct ttcagataaa aaaaatcctt agctttcgct aaggatggat 60
ttctggctca aaatggtatg gttttgac 88 

<210> 5 
<211> 71 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: W1-2 promoter
sequence 

<400> 5 

ttgacattcc ccccgccccg gggtaccctc cttcccggga ggcgcgcctc ccgaggagaa 60 
cggtacccat g 71 

<210> 6 
<211> 76 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: W-57 promoter 
sequence 

<400> 6 

ttgacaaggg aaagccgggg tgctaactta gggattgcgc tgcccttacg tagctagccc 60 
cggccggcct aggatg 76 

<210> 7 
<211> 67 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: W34 promoter 
sequence 

<400> 7 

tttattcgca aagccccccg gtgctataat ggaagacggc gtctaaacgc cttctaggag 60 
cgctatg 67 

<210> 8 
<211> 65 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: W40 promoter 
sequence 

<400> 8 

ttgacgctcc cccaaaagcc cccttataat cgctgtggaa tagcttccaa aggaggtacg 60 
gtatg 65 

<210> 9 
<211> 63 
<212> DNA 

<213> Artificial Sequence 
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<220> 

<223> Description of Artificial Sequence: PCR primer 
TR5KAN 

<400> 14 

acacacacac accatggcct aatgccaccc catgctggct tgcgccagca tgggggcccc 60 
ggcaaaagaa ttcaaaggga atgagaatag tgaatggacc 100 

<210> 15 

<211> 50 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR primer 
KAN 3 

<400> 15

gagcatggcc ctgcagaagc ttcaaaatgg tatgcgtttt gacacatcca 50 

<210> 16 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR primer 
3KM-RI 

<400> 16 

acacacgaat tccaaaatgg tatgcgtttt gacacatcc 39

<210> 17 
<211> 50 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> Description of Artificial Sequence: PCR primer 
5KM-H3 

<400> 17 

cacacacaaa gctttacgta tctagaggga atgagaatag tgaatggacc 50 

<210> 18 
<211> 97 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR primer W3 
7KM 

<400> 18 

acacacaagc ttgtagaggc ggcgctccgc ctctatggcc acccggatca tttaccccct 60 
catcaaggag gagaatagtg aatggaccaa taatgac 97 

<210> 19 
<211> 96 
<212> DNA

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: PCR primer 
W53 KM 

<400> 19 

acacacacaa gcttgacaaa ggccatgcct ccttggtatc ttcccttttg cgctgccctg 60 
aggaggagaa tagtgaatgg accaataata atgact 96 

<210> 20 
<211> 96 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR primer 
W18 KM 

<400> 20 

acacacacaa gcttgacaag gtcttccgcc aggcctccat ccaccacgtc atcgtcctgg 60 
aggaggagaa tagtgaatgg accaataata atgact 96 

<210> 21 
<211> 96 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR primer 
W51 KM 

<400> 21 

acacacaagc ttcgaatccc tccgggcccg ccattgttat cttggaaatg ggtagccttt 60 
aggaggagaa tagtgaatgg accaataata atgact 96 

<210> 22 
<211> 100 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR primer 
W57 KM 

<400> 22 

acacacacaa gcttgacaag ggaaagccgg ggtgctaact tagggattgc gctgccctca 60 
tacgtaggag gagaatagtg aatggaccaa taataatgac 100

<210> 23 
<211> 89 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: PCR primer 
W12D2 

<400> 23 
acacacacaa gcttgacatt ccccccgccc cgccgtaccc tccttcccgg gaggaggaga 60
atagtgaatg gaccaataat aatgactag 89

<210> 24 
<211> 60 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: W1-2 promoter
sequence 

<400> 24 

ttgacattcc ccccgccccg gggtaccctc cttcccggga ggcgcgcctc ccgaggagaa 60 

<210> 25 

<211> 49 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: W1-2/D2 
promoter sequence 

<400> 25 

ttgacattcc ccccgccccg gggtaccctc cttcccggga ggaggagaa 49 

<210> 26 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 

<400> 26 

atggcattgt ctttccgcta ttgaatgact atcattcaag tatggaaaga 50 

<210> 27 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 

<400> 27 

gaggttgctt ggttccggtg cacgagttct attctgccca ggccgtagcg 50 

<210> 28 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 
<400> 28

gagtacgttg accagcgctc cccgaaaggt ataagcgggc acgtaaagcc 50 

<210> 29 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 

<400> 29 

accttgtcgt gcctcgcctt gaggtagagg aacatggcgt agggctcctg 50 

<210> 30 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 

<400> 30 

ctaatctgga agtaggccgg gttcttggcg atgatctccc acaccagcac 50 

<210> 31 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 

<400> 31 

gacttgcaga aacttttggt aacctgccat agcttctacc ctcctcgttc 50 

<210> 32 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 

<400> 32 

tttgttaaag gaagcgagct ttcctcgcac ataattcacc agattcaaat 50 

<210> 33 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 
<400> 33

gctcgcttcc tttaacaaag gtgatccggt actaaaaaat ctgcaagagg 50 

<210> 34 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 

<400> 34 

aacacgcatc tgattggcag acctttttcc agaatattgt tgaagaccgt 50 

<210> 35 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 

<400> 35 

tatgaccgtg gatgaagtca gtacctggcc gcggtcttat gggcacctgg 50 

<210> 36 

<211> 50 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 

<400> 36 

gcaatcagaa tgtcaagcaa aaattggagt cgctcaaaat ccccgactcc 50 

<210> 37 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 

<400> 37 

caggtctagt ttggcgacgc gaggctcaag ggaataccgt cccggaccgc 50 

<210> 38 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 
<400> 38

ttggttggtg tcttcggcca gaaaagggaa ataatcccag gtcatgcgcc 50 

<210> 39 

<211> 50 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 

<400> 39 

aactggtttg aggcggcgct tcatctcgtc aaagtccacc aatcccggct 50 

<210> 40 

<211> 50 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 

<400> 40 

gaagttttgt agcgagaccc aagagaaatc atgatatgag tgtggtactt 50 

<210> 41 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 

<400> 41 

gggaggccat cttgtctgga ttgtagcact tccctatcct tagcccaagg 50 

<210> 42 
<211> 50
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 

<400> 42 

gtgcgcctat tttgagttct gcttcgtgga ggaggaagat ggctaagccg 50 

<210> 43 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 
<400> 43

accccggggg gttgacgcac accccccgat ctgctaactt ggccttaagt 50

<210> 44 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Putative 
promoter sequence 

<400> 44 

gaccaacagc cattggcgca aagtaccaca ctcatatcat gatttctctt 50